Model Selection for Regression on a Random Design
Abstract
We consider the problem of estimating an unknown regression function when the design is random with values in $\mathbb{R}^k$. Our estimation procedure is based on model selection and does not rely on any prior information on the target function. We start with a collection of linear functional spaces and build the least-squares estimator on a space selected from this collection by the data. We study the performance of an estimator obtained by modifying this least-squares estimator on a set of small probability. For the estimator so defined, we establish nonasymptotic risk bounds that can be related to oracle inequalities. As a consequence, we show that our estimator possesses adaptive properties in the minimax sense over large families of Besov balls $B_{\alpha,l,\infty}(R)$ with $R > 0$, $l \ge 1$ and $\alpha > \alpha_l$, where $\alpha_l$ is a positive number satisfying $1/l - 1/2 \le \alpha_l < 1/l$. We also study the particular case where the regression function is additive and then obtain an additive estimator which converges at the same rate as it does when $k = 1$.

Mathematics Subject Classification. 62G07, 62J02.

Received March 28, 2001. Revised May 5, 2002.

Introduction

Let $A$ be some subset of $\mathbb{R}^k$. We consider the problem of estimating on $A$ the unknown function $s$ mapping $\mathbb{R}^k$ into $\mathbb{R}$ in the following regression framework:

$$Y_i = s(X_i) + \xi_i, \quad i = 1, \dots, n. \tag{1}$$

The $X_i$'s are independent random variables with values in $A$ and the $\xi_i$'s are i.i.d. zero-mean random variables admitting a finite variance denoted by $\sigma^2$. For simplicity, we assume throughout that $\sigma^2$ is known. However, the results contained in this paper would only be slightly modified by replacing $\sigma^2$ by some suitable estimator such as, for example, the one proposed in Baraud [1] (Sect. 6). Throughout this paper, we assume that the sequences of $X_i$'s and $\xi_i$'s are independent. For each $i \in \{1, \dots, n\}$ we denote by $\mu_i$ the distribution of $X_i$ and set $\mu = n^{-1} \sum_{i=1}^{n} \mu_i$.
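Framework (1) and the selection idea described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the case $k = 1$ with $A = [0, 1]$, a uniform design, Gaussian noise, a collection of piecewise-constant (histogram) spaces, and a Mallows-type penalty `c * sigma**2 * dim / n` with the hypothetical constant `c = 2`. It is not the paper's penalty, and it omits the modification of the least-squares estimator on a set of small probability.

```python
import numpy as np


def simulate(n, s, sigma, rng):
    """Draw (X_i, Y_i) from model (1), Y_i = s(X_i) + xi_i, with k = 1."""
    x = rng.uniform(0.0, 1.0, size=n)       # random design on A = [0, 1] (assumption)
    y = s(x) + sigma * rng.normal(size=n)   # i.i.d. zero-mean noise with variance sigma^2
    return x, y


def histogram_design(x, dim):
    """Design matrix for piecewise-constant functions on `dim` equal-width bins."""
    bins = np.minimum((x * dim).astype(int), dim - 1)
    return np.eye(dim)[bins]


def select_model(x, y, sigma, dims, c=2.0):
    """Return the dimension (and coefficients) minimising
    mean squared residual + c * sigma^2 * dim / n  (illustrative penalty)."""
    n = len(y)
    best = None
    for dim in dims:
        design = histogram_design(x, dim)
        # Least-squares fit on the current linear space.
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        crit = np.mean((y - design @ coef) ** 2) + c * sigma**2 * dim / n
        if best is None or crit < best[0]:
            best = (crit, dim, coef)
    return best[1], best[2]


rng = np.random.default_rng(0)
x, y = simulate(500, lambda t: np.sin(2 * np.pi * t), 0.3, rng)
dim, coef = select_model(x, y, 0.3, range(1, 31))
print("selected dimension:", dim)
```

The selected dimension trades approximation error against the number of estimated coefficients, which is the trade-off an oracle inequality quantifies; the known noise level enters only through the penalty.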
By not requiring the $X_i$'s to be identically distributed, we aim to cover the particular case of deterministic $X_i$'s, for which $\mu_i = \delta_{X_i}$. Throughout the paper we fix some reference measure $\nu$ supported on $A$. Unlike $\mu$, we suppose that $\nu$ is known, and our assumptions concern $\nu$. We equip